Background

Proteins often alternate between several conformations,
e.g., active and inactive states of receptors or open and
closed states of channels. In many cases, however, only
one conformation is known. Predicting additional,
biologically relevant conformations of a protein can
provide more insight into its function in health and
disease. We introduce ConTemplate, a computational tool
for modeling putative conformations of a query protein
with at least one known conformation, on the assumption
that structurally similar proteins may also share similar
conformational changes. A three-step procedure is used
(Fig. 1). First, the Protein Data Bank [1] is searched
for proteins structurally similar to the query [2], and
structure-based pairwise sequence alignments are built
between the query and each of these proteins. Second,
other known conformations of these proteins (i.e.,
conformations different from the one resembling the
query) are identified [3]. Third, using the alignments
from the first step and the structural templates from
the second, ConTemplate models new conformations of the
query protein.
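The role of the alignment in the third step can be illustrated with a minimal sketch. It is not ConTemplate's implementation: the function names, the gap character and the toy data are assumptions made for illustration, and a real modeling step would build a full atomic model rather than copy coordinates. The sketch only shows how a gapped pairwise alignment pairs query residues with template residues, so that the template's alternative conformation can guide the query model.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def map_aligned_residues(query_aln: str, template_aln: str,
                         gap: str = "-") -> Dict[int, int]:
    """Map 0-based query residue indices to template residue indices.

    Both arguments are rows of one pairwise alignment (equal length,
    gaps marked with `gap`). Hypothetical helper, not part of ConTemplate.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, int] = {}
    qi = ti = 0  # running residue counters that skip gap columns
    for q, t in zip(query_aln, template_aln):
        if q != gap and t != gap:
            mapping[qi] = ti  # column where both proteins have a residue
        if q != gap:
            qi += 1
        if t != gap:
            ti += 1
    return mapping


def transfer_coordinates(mapping: Dict[int, int],
                         template_coords: List[Coord]) -> Dict[int, Coord]:
    """Assign each aligned query residue the position of its template partner.

    Query residues aligned to gaps get no coordinates here; a modeling
    program would have to build those regions separately.
    """
    return {qi: template_coords[ti] for qi, ti in mapping.items()}


# Toy alignment: the query has a gap at column 3, the template at column 5.
pairs = map_aligned_residues("AC-DE", "ACGD-")
```

With the toy alignment, the first two query residues pair with the first two template residues, the third query residue pairs with the fourth template residue, and the last query residue has no template partner.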

Results 

We demonstrate the method with the kinase domain of
the Epidermal Growth Factor Receptor (EGFR). Using the
inactive conformation as the query, we reproduce the
active conformation [4] with a root mean square deviation
(RMSD) of 1.76 Å. The model is based on the query's
structural similarity to the inactive conformation of
Abl tyrosine kinase [5], together with the known active
conformation of the latter kinase [6]. The sequence
identity between the two kinase domains is only 40%, so
the fact that they share similar active and inactive
conformations might not be obvious.
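For readers unfamiliar with the RMSD measure quoted here, the following is a minimal sketch of how it is computed from two paired coordinate sets. The abstract does not state which atoms were compared or how the structures were superimposed, so the sketch assumes the two sets are already superimposed and listed in matching order; the function name is illustrative.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root mean square deviation between paired 3D coordinates.

    Assumes both structures are already superimposed and that the
    i-th entries of the two lists refer to equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared Euclidean distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Identical coordinate sets give an RMSD of zero, and larger values indicate larger average displacement between equivalent atoms, in the same length unit as the input (Å for protein structures).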

Conclusions 

The idea of inferring new conformations of a protein of
interest based on known conformations in related
proteins is not new. However, to the best of our knowledge,
ConTemplate is the first automated implementation of
this approach.
[Figure 1 panel labels]
Step 1: Find structurally similar proteins.
Step 2: Indicate additional conformations for each structurally similar protein.
Step 3: Model the query with the templates found in step 2, according to the structure-based sequence alignment.
Structures shown: Query (inactive conformation of EGFR kinase domain); Target (active conformation of EGFR kinase domain); inactive conformation of Abl tyrosine kinase; Template (active conformation of Abl tyrosine kinase); Model (EGFR kinase domain, modeled using the active conformation of Abl tyrosine kinase).



Figure 1 ConTemplate methodology, demonstrated using the known structure of the EGFR kinase domain in its inactive conformation as a 
query and reproducing its active conformation; the RMSD between the active and inactive conformations is 4.17 Å. Step 1: Selecting proteins
with structural similarity to the query; only one is shown here. Step 2: Finding alternative conformations of the proteins detected in Step 1. The 
black arrows mark the regions with the main differences between the conformations. Step 3: Modeling putative new conformations of the 
query using the conformations detected in step 2 as templates; only one is shown here. The black arrows indicate the similarities between the 
model, template and actual known conformation in the main regions of the conformational changes. 




